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Embedded density functional theory (e-DFT) is used to describe the electronic structure of strongly interacting 
molecular subsystems. We present a general implementation of the Exact Embedding (EE) method [J. Chem. 
Phys. 133, 084103 (2010)] to calculate the large contributions of the non-additive kinetic potential (NAKP) 
in such applications. Potential energy curves are computed for the dissociation of Li+-Be, CH3-CF3, and 
hydrogen-bonded water clusters, and e-DFT results obtained using the EE method are compared with those 
obtained using approximate kinetic energy functionals. In all cases, the EE method preserves excellent 
agreement with reference Kohn-Sham calculations, whereas the approximate functionals lead to qualitative 
failures in the calculated energies and equilibrium structures. We also demonstrate an accurate pairwisc 
approximation to the NAKP that allows for efficient parallelization of the EE method in large systems; 
benchmark calculations on molecular crystals reveal ideal, size-independent scaling of wall-clock time with 
increasing system size. 
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I. INTRODUCTION 

Important methodological challenges in electronic 
structure theory include the long-timescale simulation of 
ab initio molecular dynamics and the seamless combina- 
tion of high- and low-level electronic structure methods 
in complex systems. Methods that exploit the intrinsic 
locality of molecular interactions have demonstrated en- 
couraging progress towards these goals^— 

In particular, orbital-free embedded density functional 
theory (e-DFT) offers a formally exact approach to elec- 
tronic structure theory in which the interactions be- 
tween subsystems are evaluated in terms of their elec- 
tronic densities Recent work has demonstrated that 
constructing the embedded subsystems from individual 
molecules leads to a linear-scaling electronic structure 
approach that maps naturally onto distributed-memory 
parallel computers ) 14 ' 19 and it provides a systematic 
framework for calculating electronic excited states in con- 
densed phase systemsi 20 ' 21 However, approximate treat- 
ments of the non-additive kinetic potential (NAKP) limit 
the accuracy of this approach in applications involving 
strongly interacting subsystems* 2 ^. For example, severe 
artifacts in the structure of liquid water, including the 
complete absence of a second peak in the oxygen-oxygen 
radial distribution function, have been predicted from 
existing approximations to the NAKP^i 9 - and e-DFT ap- 
plications involving covalently bonded embedded sub- 
systems have also been shown to qualitatively fail^Sr— 
The development of improved methods to address the 
NAKP problem will open new doors for on-the-fly, mas- 
sively parallel electronic structure calculations in general, 
condensed-phase systems. 

In this paper, we describe progress towards the devel- 
opment of accurate, scalable treatments for the NAKP in 
e-DFT. We provide the first molecular applications of our 
recently developed Exact Embedding (EE) method^ 
demonstrating that it successfully describes the break- 



ing of covalent bonds and hydrogen bonds with chemical 
accuracy. Additionally, we introduce and numerically 
demonstrate a pairwise approximation to the NAKP, 
which allows for the scalable implementation of the EE 
method in large systems. Benchmark calculations are 
presented for systems with up to 125 molecules, demon- 
strating that parallel implementation of the method en- 
ables constant system-size scaling of the wall-clock cal- 
culation time. 



II. THEORY 

A. Orbital-free Embedded DFT 

We utilize the orbital-free e-DFT formulation of 
Cortona 1 and Wesolowski and coworkers^ For the case 
in which the total electronic density pab is partitioned 
into two subsystems, pab = PA + Pb, the correspond- 
ing one-electron orbitals obey the Kohn-Sham Equations 
with Constrained Electron Density (KSCED)^ 
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where i = 1, . . . , N A , j = 1, . . . , N B , and N A and N B 
are the number of electrons in the respective subsystems. 
v c ff is the effective potential for the coupled one-electron 
equations, such that 

v c ft [p A , Pab ; r] = v nc (r ) + uj [pab ; r] + v xc [pab ; r] 

+ v nad [p A ,PAB;r], (3) 
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and E xc [p] is the exchange-correlation functional. 
Vnad [pa , Pab ', r] is the potential due to the non-additive 
kinetic energy for non-interacting electrons, such that 



Vna.d[pA, PAB\r] 
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SpA 



W, (7) 



where r s nad [p A ,p B ] = T s [p AB ] - T s [p A ] - T s [p B ] . The sub- 
system densities are constructed from the corresponding 



KS orbitals, using p A (r) = J^tLi \4>t Ml 2 and p B (r) = 
YljLi ( r )| 2 - Eqs. [T][7] are easily generalized for the 
e-DFT description of multiple embedded subsystems*^ 
Two aspects of e-DFT are worth emphasizing. Firstly, 
like conventional Kohn-Sham density functional theory 
(KS-DFT), it is a theory that is exact in principle, but 
practical calculations must employ approximations to the 
unknown exchange-correlation functional. Secondly, un- 
like conventional KS-DFT, the embedding formulation 
introduces the NAKP, u na d [pa , Pab ; r] , since the orbitals 
for subsystem A are not necessarily orthogonal to those 
of subsystem B. Without knowledge of the exact func- 
tional for the non-interacting kinetic energy, this creates 
a second source of approximation in e-DFT calculations. 
The significance of the NAKP is system-dependent, with 
the most severe cases including those for which the sub- 
system densities greatly overlap; no approximate kinetic 
energy functional has been previously demonstrated to 
yield accurate results for embedded subsystems that are 
connected by covalent bonds i^ ' 22 i 23 ' 26 i 27 



B. Exact Calculations of NAKP 

We have recently developed the Exact Embedding 
(EE) method to calculate the NAKP in the e-DFT 
framework^ 5 - The general method can be summarized 
for two embedded subsystems as follows: A Levy con- 
strained search (LCS) 28 or equivalent technique is first 
used to determine the full set of orthogonal KS orbitals, 
{w B }; that correspond to the total density /jab from the 
latest iteration of Eqs. [T]02 Then, from the KS orbitals 
{4>f B }, {<j>f }, and {4>f}, the corresponding kinetic po- 
tentials are calculated using the exact result of King and 
Handy,— 
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where n is the number of occupied orbitals, q is the KS 
eigenvalue corresponding to orbital <f>i, and p is a con- 
stant. Finally, the NAKP needed for the next iteration 
of Eqs. QJSlis calculated directly from the difference 



Wnad[/?A,PAB;r] = v Ts (r) - u£ (r) 



(9) 



where the superscripts in this equation indicate the or- 
bital set to which each kinetic potential corresponds. 

Rather than explicitly performing the LCS, we use 
the equivalent protocol of Zhao, Morrison, and Parr 
(ZMP)22r— to obtain the exact non-interacting kinetic 
energy and the KS orbitals {</>f B }. This requires solu- 
tion of the following one-electron equations 



+ v cxt {r)+v*(r) 
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in the limit A — > oo, where i = 1, . . . , (N A + N u ), and 



«£(r) = A / ^) 7 PAB(r) dr , 



(11) 



^ext(r) corresponds to any well-behaved external 
potential ) 31 ' 32 and various choices for this potential are 
described in Sec. Ill B. In practice, Eq. [TU] is solved at 
several large, finite values of A, and the KS orbitals and 
eigenvalues, as well as the final non-interacting kinetic 
energy, are obtained via extrapolation.— _ — In Sec. V, 
we discuss a technique to robustly implement the ZMP 
step for NAKP calculations in large systems. 

The EE method outlined in Eqs. [8] - [TT] is unique in 
that it allows for the formally exact calculation of the 
total electronic density within the e-DFT framework, us- 
ing integer orbital occupancies and without approxima- 
tions to the NAKP. The method was previously demon- 
strated for atomic systems with strongly overlapping sub- 
system densities^ and the current paper presents its 
first molecular applications. We note that several other 
groups have also used density inversion techniques to cal- 
culate the NAKP, assuming that the total electron den- 
sity is already available from another electronic structure 
calculation] 24 i 33 ' 34 In particular, Visscher and coworkers 
have applied this approach to molecular systems with the 
aim of developing improved non-additive kinetic energy 
functionalSfS 4 . Furthermore, Partition DFT has been in- 
troduced as a formally exact embedding scheme in which 
subsystem densities are described using partially occu- 
pied orbitals, and it has been applied to one-dimensional 
model systems.® 



III. IMPLEMENTATION DETAILS 

We have implemented e-DFT in the Molpro quan- 
tum chemistry package, 35 allowing for calculation of the 
NAKP with either approximate functionals or the EE 
method. In this section, methodological and numerical 
aspects of the implementation are discussed. 
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A. Supermolecular vs. Monomolecular Basis Sets 

The atom-centered basis sets used to solve the KSCED 
(Eqs. Q] and [5]) are implemented using two different 
conventionS i 22 i 36 In the monomolecular basis set conven- 
tion, the density for each embedded subsystem is de- 
scribed using only the basis functions that are centered 
on atoms belonging to that subsystem. In the super- 
molecular basis set convention, the density for each em- 
bedded subsystem is described using the same basis set, 
which includes functions that are centered on all atoms 
in the system. The supermolecular basis set convention 
provides a closer approximation to the complete basis set 
limit, although it is more costly. 



B. ZMP Step 

In our implementation, the ZMP step of the EE 
method is performed by solving Eq.[lO]for six large, finite 
values of A. The KS orbitals {c/> AB } are then obtained 
from extrapolation of the atomic orbital coefficients for 
the {4>t\}i usm g a third-order polynomial in A -1 , and 
normalization of the extrapolated orbitals is enforced a 
posteriori. The KS eigenvalues {e AB } are similarly ob- 
tained from extrapolation of the {e AB }- T s [pab] is cal- 
culated analytically from the extrapolated orbital coeffi- 
cients, which ensures that the total energy from the EE 
method is bound from below by the KS-DFT energy. 

In the limit A — > oo, the solutions to Eq. [TUJare inde- 
pendent of the choice of external potential u ext (r)^2r— 
although u cxt (r) does affect the convergence with increas- 
ing A. Various options where thus considered, including 



VextO) = Vne(r), 
iw(r) = ^ne(r) + ( 1 



1 



(12) 

_/V A + N B ' Wj ^ AB;r l' ( 13 ) 
Vcxt(r) = v nc {r) +wj[p A B;r] + v xc [p AB ;r}. (14) 

At every iteration of the KSCED, these versions of v ex ±(r) 
are all available without the need for additional computa- 
tion. Test calculations have indicated that the external 
potential in Eq. [TJ] leads to the fastest convergence of 
the extrapolation with increasing A, and this potential is 
used in all results for the EE method reported in Sec. IV. 



a density-based criterion to switch from the exact ex- 
pression for the kinetic potential to a numerically stable 
approximation, such as the Thomas-Fermi (TF) kinetic 
potential. The protocol used to perform this switching is 
described below. 

In a first step, we calculate the constant shift that is 
needed to match the exact result for each kinetic po- 
tential to the corresponding TF result in a prescribed 
switching region. Specifically, for each of the kinetic po- 
tentials (i.e., fr a (r) S {v^ B (r),v^ (r),w B (r)} which cor- 
respond respectively to p(r) e (pAB(r), p A {r), Pb(i")}), 
the average difference (A e {A AB ,A A ,A B }) between 
the results from Eq. [8] and from the TF functional is 
evaluated in the vicinity of the p(r) = p' density isosur- 
face. Each A is computed over gridpoints in the region 
i < f[p;r] < (1-0, where 
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3 K(p(r)-p') _|_ I ' 



(15) 



£, k, and p' are parameters that define the switching 
region, and the relative contribution from each gridpoint 
is weighted according to 



>[p;r 



— g-K(PAB(r)-p(r))^ 



(16) 



Note that the weighting function in Eq. |TB]is uniform for 
the case of p = pab', for cases in which p is one of the 
subsystem densities, w[p;r] preferentially selects values 
for which p(r) ss /?ab(i")- 

In a second step, each kinetic potential is computed on 
the grid; this is done by vertically shifting the exact result 
with the corresponding A and then smoothly switching 
to the TF result at densities below p' , using the density- 
based switching function f[p;r] in Eq. [T5] Finally, the 
NAKP is calculated from the smoothly switched kinetic 
potentials using Eq. |9] The vertical shifts that are ap- 
plied to kinetic potentials simply give rise to an additive 
constant in the final NAKP, which has no physical ef- 
fect. Although we find that switching to the TF func- 
tional at low densities is both convenient and accurate, 
the protocol described above could be performed using 
any approximate kinetic energy functional. 



IV. RESULTS: SMALL SYSTEMS 



C. NAKP Numerics for Regions of Weak Density Overlap 

Numerical evaluation of the kinetic potential from 
Eq. [8] is unstable in regions for which the correspond- 
ing density vanishes. The problem is exacerbated by 
the incorrect distance dependence of the low-density tails 
obtained from calculations using Gaussian-type orbitals 
(GTOs). 29 However, these numerically treacherous re- 
gions correspond to weak overlap between subsystem 
densities, where the magnitude of the NAKP is neces- 
sarily small and easily approximated^ We thus utilize 



A. Calculation Details 

In this section, e-DFT calculations are presented for 
the dissociation curves of (H20)2 and the covalently 
bound Li + -Be and CH3-CF3 molecules; standard KS- 
DFT calculations are included for comparison. All re- 
sults are obtained using the Molpro quantum chemistry 
package^ with KS-DFT available in the standard ver- 
sion and with the e-DFT method implemented in our 
modified version. In the e-DFT calculations, the NAKP 
is described using either the EE method or the approx- 
imate TF2L28 an d LC94 39 kinetic energy functionals; 
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these approaches will hereafter be referred to as e-DFT- 
EE, e-DFT-TF, and e-DFT-LC, respectively. 

All calculations in this section are performed using 
the B88-P86 exchange-correlation (XC) functiona h 40 ' 41 
Both the XC functional and the NAKP are evaluated 
on a grid of Becke-Voronoi 4 ^ cells with resolution to 
limit the integration error of Slater exchange to 10~ 12 
Hartree; the grid is generated using the Molpro directive 
GPJD=1CT 12 . 

The KSCED in Eqs. [JJ3] are initialized from the gas 
phase density of each subsystem, and the eigensolutions 
for each set of equations are updated at every iteration. 
Convergence of these equations is improved with the 
molecular orbital (MO) shifting and direct inversion of 
iterative subspace (DBS) algorithms . 43 ' 44 For the water 
dimer, an MO shift of -0.5 Bartrcc is employed, whereas a 
-1.0 Bartree shift is used for Li + -Be and CB 3 -CF 3 . Since 
the DBS algorithm leads to slow final convergence^ it 
is discontinued once the root mean squared difference 
(RMSD) of total density matrix elements changes by less 
than 5 x 10~ 4 between two successive iterations. The 
KSCED equations are deemed converged when the total 
energy of the system changes by less than 10~ 6 Bartree 
and the RMSD in the total density matrix is smaller than 
10~ 5 between two successive iterations. 

For the ZMP step, extrapolation of the solutions to 
Eq. [TUJ is performed using A = 7 + rj, where j = 
0, 1, . . . , 5. Unless otherwise noted, calculations for the 
water dimer and Li + -Be employ 7 = 5000 and r = 100, 
whereas calculations for CB3-CF3 employ 7 = 100 and 
t = 10. To reach adequate convergence, Eq. QJJis solved 
in several stages. Firstly, a coarse solution is reached 
by using an MO shift of — 10 3 Bartree and a value of 
A = 100. Subsequently, using this coarse solution as a 
starting point, the Eq. [TUJ solved using a smaller MO 
shift of —84 Bartree and with A = 7. Finally, solution of 
Eq. [TUJ for each increasing value of A needed for extrapo- 
lation employs the solution for the prior value of A as a 
starting point. The DBS algorithm is used throughout. 
The orbitals from Eq.[TU]are deemed converged when the 
RMSD in the density matrix was smaller than 10~ 9 be- 
tween two successive iterations; significantly tighter con- 
vergence is needed for the ZMP equations than for the 
KSCED, to ensure an accurate extrapolation. 

Calculations for the water dimer variously employ the 
aug-pc-3, aug-pc-2, and aug-pc-1 basis sets^ in each 
case using only the s- and p-type functions for the hy- 
drogen atoms and the s-, p-, and d-type functions for the 
oxygen atoms. These water dimer basis sets are here- 
after referred to as the modified aug-pc-3, aug-pc-2, and 
aug-pc-1 basis sets, respectively. Calculations involving 
Li + -Be use the s-, p-, and d-type functions of the com- 
bined aug-pc-4 and cc-pVQZ (core/valence) basis sets4£ 
In calculations for CB3-CF3, the C atoms are described 
using the s-, p-, and d-type functions of the combined 
aug-pc-4 and cc-pV6Z (core/valence) basis sets^ and 
the B and F atoms are described using the full aug-pc-1 
basis set4£ Sensitivity of the e-DFT calculations to the 
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FIG. 1. The water dimer dissociation curve, obtained using 
e-DFT-EE (red, dot-dashed), e-DFT-TF (green, dashed) and 
e-DFT-LC (blue, dotted). Also included are reference refer- 
ence KS-DFT results (black, solid), which are graphically in- 
distinguishable from the e-DFT-EE results. Total energies are 
plotted with respect to the KS-DFT minimum of -152.430722 
Hartree. Inset, the curves are shifted vertically to align the 
energy minima and horizontally to align the equilibrium dis- 
tances. 



basis set is discussed in the next section. 

Larger basis sets provide a better description of low- 
density regions, allowing for the use of smaller values for 
the parameter p' in Eqs. 1151 and !16l and providing robust- 
ness with respect to the choice of this parameter. For the 
water dimer, calculations using aug-pc-3, aug-pc-2, and 
aug-pc-1 basis sets employ values of p' = 10~ 5 , 10 -4 , and 
5 x 10~ 3 , respectively. For Li + -Be and CB3-CF3, calcu- 
lations employ p' = 10~ 6 . In each case, £ = 10~ 4 , and 
the parameter k in Eqs. [15] and [TU] is chosen such that 
up' = 10. 



B. Water Dimer 

Fig. [TJ presents the dissociation curve for the water 
dimer, a system with a strong hydrogen bond and sig- 
nificantly overlapping subsystem densities. The curve 
is obtained using e-DFT-EE (dot-dashed), e-DFT-TF 
(dashed), and e-DFT-LC (dotted); KS-DFT results 
(solid) are also included for reference. The e-DFT calcu- 
lations are performed using two embedded subsystems, 
each corresponding to a different molecule in the dimer. 
All calculations presented in the figure utilize the modi- 
fied aug-pc-3 basis set, with the e-DFT calculations em- 
ploying the supermolecular basis set convention. The 
dissociation curve is plotted as a function of the oxygen- 
oxygen distance, with the equilibrium water dimer ge- 
ometry obtained from a KS-DFT energy minimization 
and with other geometries obtained by displacing the two 
molecules along the oxygen-oxygen vector while fixing all 
other internal coordinates. 

The e-DFT-EE results in Fig. [JJ agree well with KS- 
DFT throughout the range of dissociation distances. Nu- 
merical results for the two methods are graphically indis- 
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FIG. 2. Basis set dependence of the water dimer dissocia- 
tion curve, illustrated for calculations using the (A) modified 
aug-pc-2 and (B) modified aug-pc-1 basis sets. Results for 
the e-DFT-EE, e-DFT-TF, e-DFT-LC, and KS-DFT meth- 
ods are reported as in Fig. [T] Total energies are plotted 
with respect to the KS-DFT minimum energies of -152.953947 
Hartree (panel A) and -152.864441 Hartree (panel B). 
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FIG. 3. The Li+-Be dissociation curve. Results for the e- 
DFT-EE, e-DFT-TF, e-DFT-LC, and KS-DFT methods are 
reported as in Fig.[T] The results for e-DFT-EE and the refer- 
ence KS-DFT results are graphically indistinguishable. Total 
energies are plotted with respect to the KS-DFT minimum 
energy of -21.962072 Hartree. Inset, the curves are aligned as 
in the inset of Fig. [T] 



tinguishable, and the calculated total energies differ by 
less than 0.5 kcal/mol throughout the entire attractive 
branch of the curve. Exact numerical agreement between 
the e-DFT-EE and KS-DFT descriptions is expected only 
in the limits of an exact XC functional and a complete 
basis set. 

The sensitivity of the e-DFT results to approximations 
in the NAKP is clearly demonstrated in Fig.[TJ The curve 
obtained using e-DFT-TF differs significantly from the 
KS-DFT reference, exhibiting a dissociation energy that 
is underestimated by 40% (~4 kcal/mol) and an equilib- 
rium bond length that is 0.15 A too long. Calculations 
obtained using e-DFT-LC are somewhat improved, al- 
though the dissociation energy is still overestimated by 
20% (~2 kcal/mol) and the equilibrium bond length is 
underestimated by 0.10 A. In the inset of Fig. [1] the 
curvature of the potential energy surfaces in the vicin- 
ity of the minimum are compared, revealing significant 
deviations of the results obtained using the approximate 
NAKP treatments (e-DFT-TF and e-DFT-LC) with re- 
spect to the results obtained using KS-DFT and e-DFT- 
EE. 

Iannuzzi and coworkers- 9 have demonstrated that e- 
DFT calculations using approximate treatments of the 
NAKP, including the TF and LC94 functionals, lead 
to qualitative failure in describing the structure of liq- 
uid water. Fig. [T] illustrates the origin of this failure in 
terms of the pairwise interactions among molecules, and 
it suggests that e-DFT-EE will enable the accurate, first- 
principles simulation of liquid water and aqueous solu- 
tions. Critical to this effort, however, is the efficient and 
parallelizable implementation of the EE method for large 
systems, which is discussed in Section V. 

The sensitivity of the e-DFT calculations to basis set 
completeness is illustrated in Fig. [5J in which the water 
dimer dissociation curves are recalculated using the mod- 
ified aug-pc-2 (Fig. OA.) and modified aug-pc-1 basis sets 



(Fig. [IB)- Comparison of the KS-DFT results and the e- 
DFT-EE results reveals that the agreement between the 
methods worsens with smaller basis set; of course, both 
the KS-DFT calculations and the e-DFT-EE calculations 
are basis-set dependent. In the e-DFT-EE calculations, 
smaller basis sets give rise to numerical artifacts includ- 
ing the oscillatory behavior in the King-Handy expression 
for the kinetic potential^ For the modified aug-pc-1 ba- 
sis set (Fig. [2)B), the reasonable agreement between KS- 
DFT and e-DFT-LC is due to a fortuitous cancellation of 
errors from the approximate NAKP functional and the 
small basis set. 

C. Li+-Be 

We now consider the heterolytic cleavage of a weak 
covalent bond, Li + -Be— >-Li + +Be, using KS-DFT and e- 
DFT. The e-DFT calculations were performed in the su- 
permolecular basis set convention using two embedded 
subsystems, one corresponding to the 2-electron Li ion 
and the other corresponding to the 4-electron Be atom. 
The dissociation curve for Li + -Be is plotted in Fig. [3J 

As is seen from the main figure, the e-DFT-EE cal- 
culations accurately reproduce the total energies from 
KS-DFT throughout the entire range of internuclear dis- 
tances. The dissociation curves for these two methods, 
which are graphically indistinguishable in Fig. [3J devi- 
ate by less than 0.2 kcal/mol throughout the range of 
separations and the dissociation energy deviates by only 
0.07 kcal/mol. In contrast, the e-DFT-TF results arc 
in qualitative disagreement with the KS-DFT reference 
calculations; in addition to dramatically overestimating 
the dissociation energy of the molecule by approximately 
12.5 kcal/mol, the method predicts the equilibrium bond 
length to be 20% too short. Interestingly, the e-DFT-LC 
method performs significantly worse in this application. 
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FIG. 4. The CH3-CF3 dissociation curve for heterolytic cleav- 
age of the C-C bond. Results are presented for the e-DFT-EE 
(red, dot-dashed) and KS-DFT (black, solid) methods. Total 
energies are plotted with respect to the KS-DFT minimum 
energy of -377.575687 Hartree. Inset, the curves are aligned 
as in the inset of Fig. [T] 



tance to within 1.5 kcal/mol, and the embedding method 
also recovers the reference value for the equilibrium bond 
distance. Furthermore, as is clear from the inset, e-DFT- 
EE accurately reproduces the curvature of the energy 
surface in the vicinity of the equilibrium bond distance. 
In contrast, the e-DFT-TF and e-DFT-LC descriptions 
for this system fail dramatically, predicting total energies 
at the equilibrium bond distance that deviate from the 
KS-DFT reference by approximately 730 kcal/mol and 
980 kcal/mol, respectively. For calculations with such 
strongly interacting subsystems, the failure of e-DFT 
with approximate descriptions for the NAKP methods 
has been previously observed^ However, the results for 
e-DFT-EE in Fig. @] demonstrate significant progress in 
the accurate description of covalently interacting subsys- 
tems using e-DFT. 



V. RESULTS: EXTENSION TO LARGER SYSTEMS 



The calculations based on the approximate LC94 kinetic 
energy functional overestimate the dissociation energy by 
approximately 16 kcal/mol and predict the equilibrium 
bond length to be 25% too short. The inset to Fig. [3] 
illustrates that both e-DFT methods that use approxi- 
mate treatments for the NAKP lead to an overestima- 
tion of the energy surface curvature in the vicinity of the 
equilibrium bond distance. 

The results in Fig. [3] illustrate the well-known break- 
down of e-DFT with approximate treatments of the 
NAKP for applications involving strongly overlapping 
subsystem densities. They further show that our EE 
method overcomes this large error, yielding the first nu- 
merical demonstration of an e-DFT method to describe 
covalent bond-breaking with chemical accuracy. Since 
e-DFT-EE is a formally exact method, this result is ex- 
pected. However demonstration that the level of accu- 
racy in Fig.[3]can be achieved in practical numerical simu- 
lations constitutes a non-trivial validation of the method. 



D. CH3-CF3 

In a more challenging application for e-DFT, we con- 
sider the heterolytic cleavage of a strong carbon-carbon 
CT-bond, CH 3 -CF 3 -> CH+ + CF3 . The e-DFT calcula- 
tions were again performed in the supermolecular basis 
set convention using two embedded subsystems, one cor- 
responding to the 8-electron CH^ moiety and the other 
corresponding to the 34-electron CFg" moiety. The geom- 
etry for the lowest energy point along the curve is pro- 
vided in the supplemental information; the dissociation 
curve in Fig. 2] is plotted by extending the C-C distance 
while keeping all other internal coordinates unchanged. 

The dissociation curves in Fig. [4] are presented only for 
e-DFT-EE and the reference KS-DFT calculations, e- 
DFT-EE reproduces the KS-DFT reference value for the 
total energy for the molecule at the equilibrium bond dis- 



A. Pairwise treatment of the NAKP 

In the previously described implementation of e-DFT- 
EE, the ZMP step, or an equivalent LCS, is performed 
on the full system of interest. However, numerical chal- 
lenges limit the LCS to systems with less than 10-15 
atoms j 33 ' 34 i 48 ~ — potentially hindering the applicability 
of e-DFT-EE in large systems. To avoid this problem, 
we demonstrate a pairwise approximation for the NAKP 
that enables the scalable implementation of e-DFT-EE. 

For a system composed of A su b embedded subsystems, 
{p a }, the non-additive kinetic energy can be approxi- 
mated using a pairwise sum^SS, such that 

Tr d [{ Pa }]^T s [p}~j2 T ^ ( 17 ) 

a=l 

~ ( T s[Pa +P/3] ~ T s [ Pa ] - T s \fi ]) , 

where p = Y,Z=i P<*- The NAKP for a given subsystem 
a is then 

Wsub 

fnad[Pa,{Pa};r] = ^ (v%P (t) -V^(r)). (18) 

Applying the EE method to this approximation for the 
NAKP, a ZMP step is performed at each iteration of the 
KSCED to obtain the KS orbitals corresponding to each 
pair of subsystems densities, {</>f /3 }- Then, using both 
the subsystem KS orbitals {<fi?} from the KSCED and 
the subsystem-pair KS orbitals {</>" /3 }, the NAKP is eval- 
uated directly from Eqs. [8] and [18] In this approach, only 
the NAKP is assumed to be pairwise additive; all other 
interactions in the system are treated with full general- 
ity. Since the ZMP step is applied only to the subsystem 
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pairs, this approach is numerically feasible if each subsys- 
tem is limited to a relatively small number of atoms, re- 
gardless of the total system size. The short-ranged nature 
of contributions to the non-additive kinetic energy sug- 
gests that distance-based cutoffs can be employed within 
the sum over subsystem pairs j^ 5 . 

It was emphasized earlier that the converged results of 
the ZMP step are independent of the choice of external 
potential, i; ext (r), in Eq. 1101 In the pairwise implemen- 
tation of e-DFT-EE for the water trimer in Sec. V B, we 
employ the following external potential for each pair of 
densities p a and pp, 

Uext(r) = Vnc(r) + r] + v xc [p; r] 

I STslp] Sf s [p a + pp] 

S(p a + Pfi) S(p a + pp) 

where T s indicates the approximate TF functional. This 
external potential approximates the KSCED effective po- 
tential (Eq. [3]) for the pair of subsystems embedded 
within the remainder of the full system; note that the 
TF functional is used only to regularize the effective po- 
tential for the ZMP step; it does not introduce any addi- 
tional approximation into the e-DFT-EE calculation. In 
Sec. V C, we use a simple external potential that includes 
only the electron-nuclear interactions for the subsystem 
pair. 

The following two sections demonstrate the accuracy 
of this pairwise implementation of e-DFT-EE (Sec. V B) 
and the efficiency with which it can be implemented in 
parallel (Sec. V C). 

B. Water Trimer Application: Testing Pairwise Additivity 
in the NAKP 

Fig. [5] presents a test of pairwise additivity in the 
NAKP (Eq. IT5j) for a hydrogen-bonded trimer of water 
molecules. e-DFT-EE calculations are performed using 
three embedded subsystems, each corresponding to a dif- 
ferent molecule in the trimer. In a first set of results, 
the symmetric dissociation curve for the trimer is cal- 
culated using no assumptions about the NAKP (solid); 
in a second set of results, the curve is calculated us- 
ing the assumption of pairwise additivity of the NAKP 
(dot-dashed). The equilibrium geometry is provided in 
the supplemental information; other geometries along 
the dissociation curve were then obtained by uniformly 
stretching the oxygen-oxygen distances in the cluster, 
keeping all other internal coordinates unchanged. The 
trimer calculations were performed using the modified 
aug-pc-2 basis set with the monomolecular basis set con- 
vention; all other calculation details are identical to those 
described previously for the modified aug-pc-2 calcula- 
tions of the water dimer. 

The agreement between the two curves in Fig. [5] indi- 
cates that Eqs. [FT] and [15] are excellent approximations 
for the non-additive kinetic energy and NAKP, respec- 
tively. Throughout the entire attractive branch of the 
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FIG. 5. Symmetric dissociation curves for the water trimer, 
illustrating the pairwise additivity of the NAKP. Calculations 
are performed using the e-DFT-EE method, with no approx- 
imation to the NAKP (black, solid) and with the pairwise 
approximation to NAKP (red, dot-dashed). The curves are 
plotted as a function of the sum of the three O-O distances, 
with details of the molecular geometries provided in the text. 
Total energies plotted with respect to the minimum energy of 
-229.440307 Hartree for the full NAKP treatment. Inset, the 
difference between the two curves is plotted. 

curve the total energies differ by less the 0.5 kcal/mol, 
and the largest deviations appear only in the strongly re- 
pulsive region at short distances. This good agreement is 
particularly notable, given that the cyclic trimer geome- 
tries might be expected to magnify possible non-additive 
contributions to the total energy; even better adherence 
of the NAKP to pairwise additivity is expected for linear 
geometries of the trimer. We have previously noted that 
higher-order corrections to Eqs. [17] and [18] are possible, 25 
although the results in Fig.[5]suggest that the assumption 
of pairwise additivity will be adequate in many cases. 



C. Parallel Scaling of e-DFT-EE 

Primary bottlenecks in KS-DFT include calculation 
of the two-electron integrals and solution of the eigen- 
value problem. In standard implementations, the two- 
electron integral calculations scales as M 4 and the eigen- 
value calculation scales at best as M 2 , where M is 
the total number of basis functions. 52 ' 53 More efficient 
methods for computing the two-electron integrals in- 
clude prescreening^i Ewald summations^ 5 - and the fast- 
multipole method^ however, solution of the eigenvalue 
problem remains a computational bottleneck in most KS- 
DFT implementations^ 

As has been noted in previous work^ the monomolec- 
ular basis set convention leads to advantageous scaling 
properties for e-DFT. In this convention, the number of 
basis functions used to solve each KSCED, M su b, is in- 
dependent of system size. Consequently, the total cost of 
the eigenvalue problem scales linearly with the number 
of subsystems, N su b, and it can be trivially parallelized 
to the subsystem level. 
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The cost of the two-electron integral calculation is 
also reduced in the monomolecular basis set convention. 
Terms arising from orbitals centered on molecules in more 
than two different subsystems are exactly zero, such that 
the total cost of this operation scales with N^ uh M^ uh . 
Furthermore, in this convention, the density for each sub- 
system is spatially localized, such that short-ranged con- 
tributions to the KSCED effective potential, including 
exchange, correlation, short-ranged electrostatic contri- 
butions, and pair-wise contributions to the NAKP, can be 
truncated at a cutoff distance. Long-ranged electrostatic 
contributions to the KSCED effective potential can be 
efficiently treated using Ewald summations or other stan- 
dard methods i 55 ' 56 Setting aside these long-ranged terms 
for the current demonstration, the use of distance-based 
cutoffs reduces the scaling of the total two-electron inte- 
gral calculation to N su bM^ uh , which can be parallelized 
to yield constant wall-clock time scaling with increasing 
system size. 
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FIG. 6. Wall-clock timings for lattices of hydrogen molecules, 
ranging in size from 8 to 125 Eh molecules. The dotted 
blue lines indicate ideal quadratic and linear scaling, the 
solid, black curve corresponds to the serial implementation 
of integral-prescreened KS-DFT in Molpro, and the dashed, 
red curve corresponds to e-DFF-EE using a number of parallel 
processors equal to the number of molecules in the system. 



To illustrate these scaling properties, Fig. [6] presents 
benchmark timings for simple tetragonal lattices of 8 to 
125 H 2 molecules, using both e-DFT-EE and the KS- 
DFT implementation in Molpro. The H2 molecules are 
oriented parallel to the z axis, with a bond length of 0.8 
A, and the centers-of-mass for the molecules are spaced 
by 3.0 A along the x and y axes and by 3.8 A along the z 
axis. All calculations employ the uncontracted STO-3G 
basis set^ Slater exchange^ without electron correla- 
tion, and a grid density that ensures an integration error 
in the exchange energy of less than 10~ 6 Hartree. The e- 
DFT-EE calculations are performed with each molecule 
defined as a different subsystem, using the monomolecu- 
lar basis set convention, and using one parallel processor 
per subsystem. Values for the parameters A, p', k, and 
the MO shift are the same as those used for the Li + -Bc 
system. The cutoff for the calculation of the electrostat- 
ics, exchange, and NAKP terms is set to 4.0 A in these 
calculations, such that only nearest-neighbor molecules 
in the lattice contribute to these terms. All calculations 
are performed on a cluster of dual, quad-core 2.6 GHz 
Xeon Intel processors with Infiniband communication. 



The timings in Fig. £0 indicate that the e-DFT-EE 
wall-clock time scales independently of the system size, 
with the deviations at small sizes due the boundaries 
of the finite crystal. As expected, the KS-DFT results 
in the serial Molpro implementation with integral pre- 
screening scales quadratically with the increasing sys- 
tem size. In Fig. [7j relative energy of the e-DFT-EE 
and the KS-DFT calculations are plotted as a func- 
tion of the number nearest-neighbor pairs in the lattice, 
A^airs = 3(Asub — A^ ). The error is small and indepen- 
dent of system size. The integrated error in the density 
per molecule was found to behave similarly (not shown). 
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FIG. 7. Error in the total energy of the e-DFT-EE calculation 
relative to KS-DFT for increasing system size, plotted with 
respect to the number of nearest-neighbor pairs. 

CONCLUSIONS 

We introduce a general implementation of the EE 
method for calculating NAKP contributions in the e- 
DFT framework, and we present a range of molecular ap- 
plications. The accuracy of e-DFT-EE is demonstrated 
for systems with covalently bonded and hydrogen-bonded 
subsystems. For the dissociation of the water dimer and 
the covalent bonds in Li+-Be and CH3-CF3, e-DFT-EE 
preserves excellent agreement with reference KS-DFT 
calculations, whereas approximate treatments for the 
NAKP, including those based on the TF or LC94 ki- 
netic energy functionals, lead to known failures. Fur- 
thermore, pairwise approximation of the NAKP yields 
excellent accuracy for the hydrogen-bonded water trimer, 
and it enables ideal, constant system-size scaling in ap- 
plications to molecular clusters with up to hundreds of 
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atoms. These results establish e-DFT-EE as a promis- 
ing methodology for performing accurate, first-principles 
molecular dynamics and for accurately embedding high- 
level wavefunction methods in complex systems. 
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